MLSLib: A Lip Sync Library for Multi Agents and Languages

Authors

  • H. Murakami
  • Hiromi Baba
  • Tsukasa Noma
Abstract

This article presents MLSLib, a software library for human figure animation with lip syncing. The library makes it easy to use multiple TTS systems and multiple lip motion generators, and to switch between them arbitrarily. It also supports multiple speaking agents, possibly with different TTS systems and lip motion generators. MLSLib is composed of three modules: LSSAgent, TTSManager, and FCPManager. The LSSAgent module provides a unified, simple API per agent, independent of the underlying TTS systems and lip motion generators. The TTSManager and FCPManager modules manage TTS systems and lip motion generators, respectively. Both modules support standard sets of phonetic alphabets per language, so users are freed from TTS-dependent implementation of lip motion generators. Applications to multi-lingual agents and level of detail (LOD) in lip syncing are also presented.
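To make the three-module decomposition concrete, the following is a minimal sketch of how such a design could be organized. All class names mirror the module names from the abstract, but the method signatures, the `(phoneme, duration)` intermediate representation, and the registration scheme are assumptions for illustration only, not MLSLib's actual API.

```python
# Hypothetical sketch of the three-module design described in the abstract.
# The registries decouple agents from any specific TTS or lip motion generator:
# TTS back ends emit phonemes in a standard per-language alphabet, so lip
# motion generators never need TTS-specific symbol handling.

class TTSManager:
    """Registry of TTS back ends; each yields (phoneme, duration_ms) pairs."""
    def __init__(self):
        self._engines = {}

    def register(self, name, synthesize):
        # synthesize: text -> list of (phoneme, duration_ms), using a
        # standardized phonetic alphabet for the engine's language.
        self._engines[name] = synthesize

    def synthesize(self, name, text):
        return self._engines[name](text)


class FCPManager:
    """Registry of lip motion generators (phoneme -> facial control params)."""
    def __init__(self):
        self._generators = {}

    def register(self, name, phoneme_to_params):
        self._generators[name] = phoneme_to_params

    def generate(self, name, phonemes):
        gen = self._generators[name]
        return [gen(p, d) for p, d in phonemes]


class LSSAgent:
    """Unified per-agent API, independent of the chosen TTS and generator."""
    def __init__(self, tts_mgr, fcp_mgr, tts_name, gen_name):
        self._tts_mgr, self._fcp_mgr = tts_mgr, fcp_mgr
        self._tts_name, self._gen_name = tts_name, gen_name

    def speak(self, text):
        phonemes = self._tts_mgr.synthesize(self._tts_name, text)
        return self._fcp_mgr.generate(self._gen_name, phonemes)


# Usage: two agents could each pair a different engine with a different
# generator; here a single dummy pairing demonstrates the data flow.
tts = TTSManager()
tts.register("dummy_en", lambda text: [("h", 80), ("ei", 120)])

fcp = FCPManager()
fcp.register("simple", lambda p, d: {"viseme": p, "ms": d})

agent = LSSAgent(tts, fcp, "dummy_en", "simple")
frames = agent.speak("hey")
print(frames)  # [{'viseme': 'h', 'ms': 80}, {'viseme': 'ei', 'ms': 120}]
```

Because both registries are keyed by name, switching an agent to a different TTS system or generator is a matter of passing different names at construction time, which matches the abstract's claim of arbitrary switching and multi-agent support.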

Related articles

Animating Lip-Sync Characters

Speech animation is traditionally considered as important but tedious work for most applications, especially when taking lip synchronization (lip-sync) into consideration, because the muscles on the face are complex and interact dynamically. Although there are several methods proposed to ease the burden on artists to create facial and speech animation, almost none are fast and efficient. In thi...


Method for Custom Facial Animation and Lip-Sync in an Unsupported Environment, Second Life™

The virtual world of Second Life™ does not offer support for complex facial animations, such as those needed for an intelligent virtual agent to lip sync to audio clips. However, it is possible to access a limited range of default facial animations through the native scripting language, LSL. Our solution to produce lip sync in this environment is to rapidly trigger and stop these default anima...


Detecting audio-visual synchrony using deep neural networks

In this paper, we address the problem of automatically detecting whether the audio and visual speech modalities in frontal pose videos are synchronous or not. This is of interest in a wide range of applications, for example spoof detection in biometrics, lip-syncing, speaker detection and diarization in multi-subject videos, and video data quality assurance. In our adopted approach, we investig...


ObamaNet: Photo-realistic lip-sync from text

We present ObamaNet, the first architecture that takes any text as input and generates both the corresponding speech and synchronized photo-realistic lip-sync videos. Contrary to other published lip-sync approaches, ours is only composed of fully trainable neural modules and does not rely on any traditional computer graphics methods. More precisely, we use three main modules: a text-to-speech n...


Automated Gesturing for Embodied Animated Agent: Speech-driven and Text-driven Approaches

We present two methods for automatic facial gesturing of graphically embodied animated agents. In one case, a conversational agent is driven by speech in an automatic lip sync process: by analyzing the speech input, lip movements are determined from the speech signal. The other method provides a virtual speaker capable of reading plain English text and rendering it in the form of speech accompanied by the app...



Publication year: 2002